Data-driven clustered hierarchical tandem system for LVCSR
Abstract
In tandem systems, the outputs of multi-layer perceptron (MLP) classifiers have been successfully used as features for HMM-based automatic speech recognition. In this paper, we propose a data-driven clustered hierarchical tandem system that yields improved performance on a large-vocabulary broadcast news transcription task. The complicated global learning problem of a large monolithic MLP classifier is divided into simpler tasks, using hierarchical structures that are clustered based on the outputs of a monolithic MLP to alleviate phone confusion. The proposed approach yields error rate reductions of up to 16.4% over MFCC features alone.
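The abstract does not say how the hierarchical structures are clustered, so the following is only an illustrative sketch under stated assumptions: a soft confusion matrix is estimated from the frame-level posteriors of a monolithic MLP, and phones that are most often confused with each other are merged greedily using average linkage. The function names `soft_confusion` and `cluster_phones`, the linkage rule, and the toy numbers are hypothetical and are not taken from the paper.

```python
def soft_confusion(posteriors, labels, n_phones):
    """Row p is the mean MLP posterior vector over frames whose reference phone is p.

    posteriors: list of per-frame posterior vectors (each of length n_phones)
    labels:     list of reference phone indices, one per frame
    """
    sums = [[0.0] * n_phones for _ in range(n_phones)]
    counts = [0] * n_phones
    for post, ref in zip(posteriors, labels):
        counts[ref] += 1
        for k in range(n_phones):
            sums[ref][k] += post[k]
    return [
        [v / counts[p] if counts[p] else 0.0 for v in sums[p]]
        for p in range(n_phones)
    ]


def cluster_phones(conf, n_clusters):
    """Greedily merge the two most mutually confusable phone groups.

    Assumption: average linkage on the symmetrized confusion matrix.
    n_clusters must be at least 1.
    """
    clusters = [[p] for p in range(len(conf))]

    def linkage(a, b):
        # Symmetric average confusion between two groups of phones.
        total = sum(conf[i][j] + conf[j][i] for i in a for j in b)
        return total / (2 * len(a) * len(b))

    while len(clusters) > n_clusters:
        pairs = (
            (i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        i, j = max(pairs, key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = sorted(clusters[i] + clusters[j])
        del clusters[j]
    return clusters


# Toy example (made-up numbers): phones 0/1 and phones 2/3 are confusable pairs.
toy_conf = [
    [0.60, 0.30, 0.05, 0.05],
    [0.30, 0.60, 0.05, 0.05],
    [0.05, 0.05, 0.60, 0.30],
    [0.05, 0.05, 0.30, 0.60],
]
toy_clusters = cluster_phones(toy_conf, 2)
```

In a hierarchical arrangement of this kind, each resulting group could be given its own smaller classifier that only has to discriminate among the phones inside the group; whether the paper does exactly this is not stated in the abstract.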
Similar resources
Hierarchical processing of the modulation spectrum for GALE Mandarin LVCSR system
This paper aims at investigating the use of TANDEM features based on hierarchical processing of the modulation spectrum. The study is done in the framework of the GALE project for recognition of Mandarin Broadcast data. We describe the improvements obtained using the hierarchical processing and the addition of features like pitch and short-term critical band energy. Results are consistent with ...
Hierarchical neural networks feature extraction for LVCSR system
This paper investigates the use of a hierarchy of Neural Networks for performing data-driven feature extraction. Two different hierarchical structures based on long and short temporal context are considered. Features are tested on two different LVCSR systems for Meetings data (RT05 evaluation data) and for Arabic Broadcast News (BNAT05 evaluation data). The hierarchical NNs outperform the sing...
Cross-lingual and multi-stream posterior features for low resource LVCSR systems
We investigate approaches for building large vocabulary continuous speech recognition (LVCSR) systems for new languages or new domains using limited amounts of transcribed training data. In these low resource conditions, the performance of conventional LVCSR systems degrades significantly. We propose to train a low resource LVCSR system with additional sources of information like annotated data from other l...
Analysis and Comparison of Recent MLP Features for LVCSR Systems
MLP-based front-ends have evolved in different ways in recent years beyond the seminal TANDEM-PLP features. This paper aims at providing a fair comparison of these recent advances, including the use of different long/short temporal inputs (PLP, MRASTA, wLP-TRAPS, DCT-TRAPS) and the use of complex architectures (bottleneck, hierarchy, multistream) that go beyond the conventional three-layer MLP. F...
ACID/HNN: clustering hierarchies of neural networks for context-dependent connectionist acoustic modeling
We present the ACID/HNN framework, a principled approach to hierarchical connectionist acoustic modeling in large vocabulary conversational speech recognition (LVCSR). Our approach consists of an Agglomerative Clustering algorithm based on Information Divergence (ACID) to automatically design and robustly estimate Hierarchies of Neural Networks (HNN) for arbitrarily large sets of context-depend...